Byte Pair Transformation using Zero-Frequency Bytes with Varying Number of Passes
Authors
Abstract
The byte pair encoding (BPE) algorithm was suggested by P. Gage to achieve data compression. It replaces all instances of the most frequent byte pair with a zero-frequency byte, that is, a byte value that does not occur in the source data. This process is repeated for up to m passes, the maximum possible, until no further compression is possible, either because no frequently occurring byte pairs remain or because no unused zero-frequency bytes remain to represent pairs. In each pass, the substitution information is written out before the encoded data. The algorithm is very time consuming because it must determine the most frequent byte pair in each pass before starting substitution. We propose a k-pass byte-pair transformation algorithm, where k may be much smaller than the maximum possible number of passes m. Our aim is to minimize compression time while achieving an equivalent compression rate. The proposed algorithm transforms half of the possible most frequent byte pairs in each pass except the last; in the last pass, it transforms all remaining possible byte pairs. The reduced number of passes saves the time otherwise spent computing byte-pair frequencies in up to m passes. Experimental results show that the proposed algorithm took 3.213, 9.794, 13.324, 16.323, and 22.388 seconds with 1, 2, 3, 4, and 6 passes respectively, compared with 295.642 seconds for m passes. The compression rate achieved by the transformation is 14.72%, 20.12%, 21.89%, 22.67%, and 22.96% with 1, 2, 3, 4, and 6 passes respectively, compared with 25.55% using the maximum m passes. As the number of passes increases, compression improves at the cost of increased execution time. The speed goal is thus met with little loss in compression rate.
General Terms: Data Compression, Algorithms
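The k-pass scheme described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation. Several details are assumptions that the abstract does not specify: "half of the possible pairs" is taken to mean half of the currently unused byte values, overlapping pairs are resolved by a greedy left-to-right scan, pairs occurring fewer than twice are skipped, and the substitution tables are kept in a Python list instead of being written before the encoded data. The names `transform_pass`, `k_pass_transform`, and `inverse_transform` are hypothetical.

```python
from collections import Counter


def transform_pass(data: bytes, share: float):
    """One pass: map the most frequent byte pairs to unused byte values.

    `share` is the fraction of zero-frequency bytes this pass may consume
    (assumption: 0.5 for every pass except the last, 1.0 for the last).
    """
    present = set(data)
    unused = [b for b in range(256) if b not in present]  # zero-frequency bytes
    # Counts overlapping occurrences too, which is good enough for ranking.
    pairs = Counter(zip(data, data[1:]))
    budget = int(len(unused) * share)
    chosen = [p for p, n in pairs.most_common(budget) if n >= 2]
    table = dict(zip(chosen, unused))  # pair -> code byte

    out = bytearray()
    i = 0
    while i < len(data):
        code = table.get((data[i], data[i + 1])) if i + 1 < len(data) else None
        if code is None:
            out.append(data[i])
            i += 1
        else:
            out.append(code)  # greedy: consume both bytes of the pair
            i += 2
    # Return the inverse table (code byte -> pair) for decoding.
    return bytes(out), {code: pair for pair, code in table.items()}


def k_pass_transform(data: bytes, k: int):
    """Run k passes; the frequency table is recomputed only k times."""
    tables = []
    for p in range(k):
        share = 1.0 if p == k - 1 else 0.5
        data, table = transform_pass(data, share)
        tables.append(table)
    return data, tables


def inverse_transform(data: bytes, tables):
    """Undo the passes in reverse order by expanding each code byte."""
    for table in reversed(tables):
        out = bytearray()
        for b in data:
            out.extend(table.get(b, (b,)))
        data = bytes(out)
    return data
```

Because each pass substitutes many pairs at once, pair frequencies are computed only k times, whereas the m-pass BPE recomputes them once per substituted pair; that is the source of the time saving the abstract reports.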
Similar resources
Quad-Byte Transformation using Zero-frequency Bytes
The byte pair encoding (BPE) algorithm was suggested by P. Gage to achieve data compression. It replaces all instances of the most frequent byte pair with a zero-frequency byte, that is, a byte value that does not occur in the source data. This process is repeated for up to m passes, the maximum possible, until no further compression is possible, either because there are no more frequently occurring byte pairs or there are no more unused zero-fre...
Achieving Better Compression Applying Index-based Byte-Pair Transformation before Arithmetic Coding
Arithmetic coding is used in many compression techniques during the entropy encoding stage. Further compression is not possible without changing the data model and increasing redundancy in the data set. To increase the redundancy, we have applied index-based byte-pair transformation (BPT-I) as a pre-processing step before arithmetic coding. BPT-I transforms the most frequent byte pairs (2-byte integers). He...
A Collision Attack on a Double-Block-Length Compression Function Instantiated with 8-/9-Round AES-256
$f_0(h_0 \| h_1, M) = E_{h_1 \| M}(h_0) \oplus h_0$, $f_1(h_0 \| h_1, M) = E_{h_1 \| M}(h_0 \oplus c) \oplus h_0 \oplus c$, where $\|$ represents concatenation, $E$ is AES-256 and $c$ is a 16-byte nonzero constant. The proposed attack is a free-start collision attack using the rebound attack proposed by Mendel et al. The success of the proposed attack largely depends on the configuration of the constant $c$: the number of its non-zero bytes and their posit...
Non-uniform Error Protection for Wavelet Transformed Images
This paper presents non-uniform error protection techniques based on wavelet transforms, and conducts comparative performance evaluations of uniform and non-uniform error protection. Wavelet transformation of an image consists of the decomposition of the image into a number of subbands, where each subband represents information in a particular frequency band. The choice of the wavelet i...
CODING - Stream Cipher Methods by Varying Components during Ciphering Data
The kernel of the symmetric block ciphering methods presented here is the coupling of XOR operations and invertible substitution tables S, with all possible 256**t byte groups (with t = 1, 2, 3, ... bytes, fixed at the beginning) being derived from keys: K(block) := S(S(block) ⊗ Eo) ⊗ Eu, with Eo an upper and Eu a lower triangular (byte-group) matrix with (byte-block-length/t)**2 values, value 1 at all n...